DL4H Task Implementation for Patient Summarization #1012

Open
vishal-iemcal wants to merge 3 commits into sunlabuiuc:master from vishal-iemcal:master

@vishal-iemcal

Contributor name and NetID: Vishal Vyas vyas9 (vyas9@illinois.edu)
Type of contribution: Standalone Task
Original Paper: A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models
Link to Original Paper: https://arxiv.org/abs/2402.15422
High-level description of implementation: Implements a new standalone task that extracts patient discharge note samples from the MIMIC-IV-Note dataset. The extracted samples are then cleaned further and used to train Large Language Models to generate patient summaries and detect hallucinations.

File guide listing

New Files
pyhealth/tasks/discharge_note_summarization.py: A new Task that extracts patient discharge summaries and text samples from the MIMIC-IV-Note dataset. The Task also processes each patient's discharge events to extract the brief hospital course, text, subject ID, and hadm ID.

tests/core/test_discharge_note_summarization.py: Test cases for the discharge_note_summarization.py file

docs/api/tasks/pyhealth.tasks.DischargeNoteSummarization.rst
examples/discharge__summary_samples.ipynb: Notebook showing how the Task is used to extract patient summaries, and how additional cleaning is applied to generate the dataset for training LLMs (GPT-4 and Llama 70B)

Modified Files
docs/api/tasks.rst: Updated the index
pyhealth/tasks/__init__.py: Registered the Task
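
Illustrative sketch (not the code in this PR): to make the extraction step described for the new Task concrete, the snippet below shows one way a "Brief Hospital Course" section could be pulled out of a discharge note and packaged with its identifiers. The helper names `extract_brief_hospital_course` and `make_sample`, the dictionary keys, and the list of section headers are all assumptions made for illustration. Real MIMIC-IV-Note headers vary, so the header list would likely need tuning.

```python
import re
from typing import Dict, Optional

# Assumed section headers; real discharge notes vary, so this list is illustrative.
START_HEADER = "Brief Hospital Course:"
END_HEADERS = (
    "Medications on Admission:",
    "Discharge Medications:",
    "Discharge Disposition:",
    "Discharge Diagnosis:",
)
_START_RE = re.compile(re.escape(START_HEADER), re.IGNORECASE)
_END_RE = re.compile("|".join(re.escape(h) for h in END_HEADERS), re.IGNORECASE)


def extract_brief_hospital_course(text: str) -> Optional[str]:
    """Return the text between the start header and the next known header."""
    start = _START_RE.search(text)
    if start is None:
        return None
    # Look for the first known header that follows the start header.
    end = _END_RE.search(text, start.end())
    body = text[start.end(): end.start() if end else len(text)]
    # Collapse runs of whitespace so the section reads as one paragraph.
    body = re.sub(r"\s+", " ", body).strip()
    return body or None


def make_sample(subject_id: str, hadm_id: str, text: str) -> Optional[Dict[str, str]]:
    """Build one sample dict (hypothetical keys), or None if the section is missing."""
    bhc = extract_brief_hospital_course(text)
    if bhc is None:
        return None
    return {
        "subject_id": subject_id,
        "hadm_id": hadm_id,
        "text": text,
        "brief_hospital_course": bhc,
    }
```

Notes without the section are skipped by returning `None`, so a caller can filter them out before any further cleaning.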
